Abstract

This article aims to clarify when, and through what mechanism, sanctioning or punishment by an institution can deliver the evolution of cooperation. Compared to peer sanctioning, institutional sanctioning may be sensitive to players' attitudes toward players who do not pre-commit to punishment. Departing from former studies based on a punisher who always acts cooperatively, we assume that the punishing player is skeptical in that she cooperates in proportion to how many players of her own type join her team. Relying on stochastic adaptive dynamics, we show that institutional sanctioning coupled with a skeptical punisher can make cooperation evolve in cases where peer sanctioning may not.
1 Introduction

In theory and practice, sanctioning misbehavior is a core and integral part of delivering the evolution of cooperation. If defectors are not restrained, they increase in number, which ultimately leads to the demise of cooperation. In preventing defectors from thriving, the role of punishers, who make a sacrifice of their own to sanction defectors, is considered critical. However, punishing players tend to be evolutionarily inferior to defectors, since it is hard for the punisher to outperform the punished. For this reason, many studies based on evolutionary game theory and its dynamics have been developed to illustrate how punishers can survive despite their evolutionary disadvantage and make cooperation evolve in a society.

This paper touches on another aspect of sanctioning, one that is not frequently treated in related theories. When we observe defectors of some kind in a society, how do we react to them? Some may simply let them pass, and some may punish them directly, by scolding them or taking some sort of physical action against them. Another way to punish defection is to resort to an institution, such as the police or a higher authority, that suppresses defection as a representative of the general good, which we obey. Most theoretical research on punishment has conventionally assumed peer sanctioning, where players punish defectors directly, which leaves the intriguing issues of institutional sanctioning untouched. Based on the methodology of stochastic evolutionary dynamics, this article explores when and how such institutional sanctioning makes cooperation evolve in a simple theoretical setting. Our main argument consists of two parts:

1) The commitment problem is important when institutional sanctioning is applied. If pre-commitment is possible by paying some cost of sanctioning ex ante, commitment can be made credibly. Public information on the level of commitment may affect players' strategic choices, especially those of punishers who have already paid their bill. We show that the skepticism of punishers plays a crucial role in establishing cooperation in a team when it is coupled with institutional sanctioning. This skepticism equips a player with the ability to defend herself against unconditional defection and/or to exploit unconditional cooperation. Although our skepticism may not translate directly into selfishness, the behavior of our skeptical punisher can partly be considered selfish. In this sense, our model implies that players' selfishness does not always disturb the evolution of cooperation.

2) In contrast to former studies where a sufficient intensity of punishment is assumed for peer sanctioning, our model works well for a less intense range of punishment. This implies that the institutional solution can be complementary to direct solutions such as peer punishment. Since in many real-world circumstances direct and harsh punishment cannot be readily implemented, institutional sanctioning may fit these cases.

The paper is organized as follows. Section 2 reviews former studies related to our argument. Section 3 elaborates the setup of the model. Section 4 provides two versions of the evolutionary process that show when and how institutional sanctioning can be effective. Section 5 summarizes the gist of the paper and comments on future research agendas.

2 Related Studies 

The first line of research that inspired this paper is the evolution of cooperation "via freedom to coercion." Based on stochastic evolutionary dynamics, Hauert et al. (2007) show that the evolutionary dilemma concerning the origin and the stability of punishment can be solved when a loner strategy is introduced. The loner exits her team and gets a fixed payoff unrelated to others' strategic choices. Loners take over (fixate) a population of defectors, and loners are in turn taken over by the cooperators and the (cooperative) punishers. This evolutionary history ends up with a prevailing cooperative state. We suggest another route to the evolution of cooperation, which works without introducing such lone interaction.

The second is the paradoxical role of "selfish" punishment in the evolution of cooperation, where a player is disposed to punish other defectors even though she herself defects. Eldakar et al. (2007) and Eldakar and Wilson (2008) assume that the strategic choice can be separated from the act of punishment, and show that selfish acts can bring about the evolution of cooperation. Instead of the pure selfishness postulated in these studies, we introduce skepticism, which makes players choose their strategy in a team based on information about the commitment to punish.

The last is the peculiarity of institutional sanctioning compared to peer sanctioning (Yamagishi, 1986; Gürerk et al., 2006; Kosfeld et al., 2009; Sigmund et al., 2010). Unlike most studies, which presume peer sanctioning, we introduce institutional sanctioning, in which punishment is carried out by some authority above the individual players. With peer sanctioning, the cost of sanctioning can change with the number of defectors in a team, which may make a pre-commitment to sanction non-binding. Institutional sanctioning can be credibly committed to by paying some cost ex ante, before the choice of strategy. Sigmund et al. (2010) show that an institution may deliver the evolution of cooperation when second-order punishment has to be tackled. By assuming that players make use of information about commitment, our research investigates a more basic and elementary aspect of institutional sanctioning, which makes cooperation evolve in a direct way.
3 Setups for Sanction by Institution

Public good game with committing stage

Our basic framework is a simple game, the (linear) public good game (PGG) of size $G > 3$. We consider a well-mixed population of constant size $M \gg G$, from which $G$ individuals are randomly selected and offered the option to participate in the PGG. Each can either contribute to the public good or not, that is, cooperate (C) or defect (D). For simplicity, players invest a fixed amount normalized to 1. The contributions of the cooperators in a team are multiplied by $r > 2$ and then divided equally among all $G$ participants. Writing $x_C$ for the number of cooperators in a team, the payoff for each C and D is given by

\[
\begin{cases}
r\,\dfrac{x_C}{G} & \text{for D}, \\[2mm]
r\,\dfrac{x_C}{G} - 1 & \text{for C}.
\end{cases}
\]
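As an illustration of the payoff rule above, the following minimal Python sketch computes the payoffs of a cooperator and a defector in one team. The function name `pgg_payoffs` and its arguments are hypothetical and introduced here only for illustration; the sketch assumes the unit investment and the equal sharing described in the text.

```python
def pgg_payoffs(x_c: int, group_size: int, r: float) -> dict:
    """Payoffs in one linear public good game with unit contributions.

    x_c is the number of cooperators among the group_size participants.
    """
    if not 0 <= x_c <= group_size:
        raise ValueError("x_c must lie between 0 and group_size")
    share = r * x_c / group_size  # equal share of the multiplied contributions
    # Cooperators receive the same share but have paid the investment of 1.
    return {"D": share, "C": share - 1.0}


# Example with G = 5 and r = 3, the values used in the figures of the paper.
payoffs = pgg_payoffs(x_c=3, group_size=5, r=3.0)
```

With three cooperators in a team of five and $r = 3$, each defector receives 1.8 and each cooperator 0.8, so within a team defection pays more by exactly the investment of 1.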

To integrate commitment to punishment into our model, the game proceeds in three stages at the team level.

1) Committing stage: Players can pay $\gamma$ to establish a sanctioning institution. By paying this cost, a player pre-commits to the sanctioning of defecting players. All participants in a team know this information.

2) Contributing stage: Players participate in the PGG described above and obtain their payoffs.

3) Sanctioning stage: Finally, the sanctioning mechanism works to punish defecting players. The sanctioning technology is implemented in such a way that every defector is punished equally by the sanctioning institution established in the first stage. The payoff net of sanctions is the final payoff for each player.

Skepticism of punisher 

The punishing players (P) in former studies tend to be naive: they act cooperatively and punish others whenever they observe defectors in their team. We call this type the cooperative punisher (CP). For CP, the committing stage is redundant, since she cooperates regardless of the information on commitment.

In comparison to CP, we propose a more skeptical type of punisher, who chooses her strategy based on information from the committing stage. We call her the skeptical punisher (SP); she cooperates in proportion to the level of commitment in her team (Rustagi et al., 2010). Although the strategies of the other players in her team are not known to her, the information available to SP indicates the intensity of punishment against defection in the sanctioning stage. When punishment is weak, defection can be beneficial for a player, since she can exploit unconditional cooperators or best defend herself against unconditional defectors. In sum, SP is a punisher who is sensitive to the information on commitment when choosing her strategy.

Sanctioning mechanism 

The credibility of commitment depends on the sanctioning mechanism. To make commitment credible, we introduce institutional sanctioning, in which each punisher in a team pays a fixed amount to form a local institution that polices her team. This institution is used to punish defectors in the sanctioning stage. The total collected for punishment in a team, $\gamma x_P$, is the total cost of punishment, and a sanction of $\beta x_P / x_D$ is imposed on each defector, where $\beta > \gamma$ is the sanctioning technology and $x_D$ and $x_P$ are the numbers of D and P in the team, respectively. The amount $\gamma x_P$ is spent for nothing if there are no defectors in the team. On this account, institutional sanctioning can incur a social cost when a team consists only of SP.
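The sanctioning mechanism just described can be sketched in a few lines of Python. This is only an illustration under the reading given in the text (a fund of $\gamma x_P$ and a sanction of $\beta x_P / x_D$ per defector); the function name `institutional_sanction` and its return convention are hypothetical.

```python
def institutional_sanction(x_d: int, x_p: int, beta: float, gamma: float):
    """Return (fund, fine_per_defector) for one team.

    fund: total paid in by the x_p punishers, gamma * x_p.
    fine_per_defector: beta * x_p / x_d, or 0.0 when nobody defects,
    in which case the fund is spent for nothing.
    """
    if x_d < 0 or x_p < 0:
        raise ValueError("player counts must be non-negative")
    fund = gamma * x_p
    fine_per_defector = beta * x_p / x_d if x_d > 0 else 0.0
    return fund, fine_per_defector


fund, fine = institutional_sanction(x_d=2, x_p=3, beta=0.4, gamma=0.1)
wasted_fund, no_fine = institutional_sanction(x_d=0, x_p=5, beta=0.4, gamma=0.1)
```

The second call illustrates the social cost mentioned above: a team of five punishers without any defector still pays $5\gamma$ in total, and no sanction is ever applied.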

Payoffs 

With institutional sanctioning, the payoffs $V_i$ for $i \in \{C, D, P\}$ have four parts: the benefit from cooperation in a team, which is shared equally; the investment in the PGG when the player chooses cooperation; the sanction inflicted by peers or by the institution; and the cost of sanctioning incurred if she commits to the sanctioning institution. After normalizing the investment to 1, the payoffs of institutional sanctioning with CP are given by

[Equation (2) is not recoverable from the source; only the trailing fragment "$1 - \gamma$" of its last line survives.]

where $r$ is the multiplier of the PGG. The total population consists of $n_i$ players of each type $i \in \{C, D, P\}$, with $n_C + n_D + n_P = M$.^1

In the case of SP, the probability that SP cooperates is assumed to be simply $s_P(x_P) := \frac{x_P - 1}{G - 1}$, which means that SP considers the level of commitment by the others in her team when choosing her strategy. Payoffs of institutional sanctioning with SP are given by

[The corresponding equation is not recoverable from the source.]
^1 For peer sanctioning, $V_D := r\,\frac{x_C + x_P}{G} - \beta x_P$ and $V_P := r\,\frac{x_C + x_P}{G} - 1 - \gamma x_D$.
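Because the payoff equations of this subsection are not legible in the source, the following Python sketch should be read as one possible reconstruction from the verbal description of the four payoff parts, not as the authors' formulas. It assumes that CP always contributes, that each defector is fined $\beta x_P / x_D$, and that SP cooperates with probability $(x_P - 1)/(G - 1)$; the function names are hypothetical.

```python
def sp_cooperation_prob(x_p: int, group_size: int) -> float:
    """Assumed rule: share of the other G - 1 team members who also committed."""
    if not 1 <= x_p <= group_size:
        raise ValueError("x_p counts the focal punisher, so 1 <= x_p <= G")
    return (x_p - 1) / (group_size - 1)


def cp_payoffs(x_c: int, x_d: int, x_p: int, r: float, beta: float, gamma: float) -> dict:
    """Assumed team payoffs with cooperative punishers (CP).

    C and CP both contribute 1; every defector is fined beta * x_p / x_d;
    every punisher has paid gamma in the committing stage.
    """
    group_size = x_c + x_d + x_p
    benefit = r * (x_c + x_p) / group_size
    fine = beta * x_p / x_d if x_d > 0 else 0.0
    return {"C": benefit - 1.0, "D": benefit - fine, "P": benefit - 1.0 - gamma}


lone_sp = sp_cooperation_prob(x_p=1, group_size=5)   # a single SP never cooperates
all_sp = sp_cooperation_prob(x_p=5, group_size=5)    # a team of SP always cooperates
team = cp_payoffs(x_c=2, x_d=1, x_p=2, r=3.0, beta=0.4, gamma=0.1)
```

Under these assumptions a punisher always earns exactly $\gamma$ less than a cooperator in the same team, which is the intuition used later for why C can invade a population of CP.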
4 Stochastic Imitation Dynamics

This section discusses two versions of stochastic dynamics in which imitation serves as players' social learning. The first is designed for intuitive and analytic purposes; the second is implemented for a more general and precise validation of our argument. As mentioned, only three types, C, D and P, are cast in our scenario. The loner, who plays a critical role in Hauert et al. (2007), is excluded on purpose to illustrate an alternative route to cooperation that does not need a bypass such as lone interaction.

Simple Imitation Dynamics 

When the population consists of a single type, the state is absorbing in that imitation or adaptive dynamics makes no change; no player can learn anything from the others in the population. We propose a simple imitation dynamics that serves as a heuristic for investigating the evolutionary dynamics in our discussion. In particular, this dynamics is easy to handle, since the sampling complications of more general processes such as the Moran process can be simplified without losing their implications. We first make three assumptions to model our simple adaptive process.

1) Adiabatic stochastic process: When a mutation or innovation is introduced, a state can either be overturned by the invaders or remain unchanged because they disappear. Resident players who observe the invading type are quick to change their strategy if the payoff of the invaders is better than theirs. If the residents earn a better payoff than the mutants, imitation makes the mutants disappear. Assuming that such innovations are extremely rare, imitation works much faster than innovation; that is, the next mutation always happens after the learning process has ended in a homogeneous state. This process can be called "adiabatic" since a newly introduced mutation ends with either the extinction or the fixation of its type. These processes can be described by a simple Markov chain with as many states as there are player types (Taylor et al., 2004; Fudenberg and Imhof, 2006; Sigmund, 2010).

2) Multiple mutants: Instead of assuming a single invader, we assume that multiple mutants of one type are introduced into a homogeneous state. First, a single mutant is not suitable for approximating a more general and complex evolutionary process such as the Moran process. In the Moran process, a mutant that does worse than the resident may not go extinct immediately in the sampling process. This can be partly modeled by allowing multiple mutants of a type. Second, a mutant type can arise in a group rather than singly if a single mutant has some degree of extra influence over other residents. The size of the mutation of type $k$ is denoted by $\mu_k \ge 2$ for $k \in \{C, D, P\}$. We assume $1 < \mu_k < G$, which means that the mutation is not too massive.

3) Simultaneous imitation of a universal model: When a homogeneous state is perturbed by a group of mutants of one type, our simple imitation process sets in. After a session of interaction ends, every player imitates a universal model, who is chosen by the size of her payoff. If there is a tie among some of the players, one of them is chosen at random. If there is a tie among all players, imitation follows neutral drift, in which the model to imitate is chosen at random.
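The third assumption can be made concrete with a small sketch. The code below is only an illustration of "simultaneous imitation of a universal model": the player with the highest payoff is selected, ties are broken at random, and everyone adopts her type. The function names and the dictionary-based representation of players are hypothetical choices made for this example.

```python
import random


def choose_universal_model(payoffs: dict, rng=random):
    """Pick the player everyone imitates: highest payoff, ties broken at random."""
    best = max(payoffs.values())
    tied = [player for player, value in payoffs.items() if value == best]
    return rng.choice(tied)


def imitate(types: dict, payoffs: dict, rng=random) -> dict:
    """All players simultaneously adopt the type of the universal model."""
    model = choose_universal_model(payoffs, rng)
    return {player: types[model] for player in types}


types = {"a": "D", "b": "D", "c": "P"}
payoffs = {"a": 0.2, "b": 0.2, "c": 0.5}
new_types = imitate(types, payoffs)
```

When all payoffs are equal, every player is tied, so the model is a uniformly random player; this is the neutral drift case of the assumption.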

A simple Markov transition matrix can be obtained for the three stationary states, and it can be used to calculate the invariant distribution over the three states. With the states ordered C, D, P and each column collecting the transitions out of one state, this transition matrix is given by

\[
\begin{pmatrix}
1 - \phi_{DC} - \phi_{PC} & \phi_{CD} & \phi_{CP} \\
\phi_{DC} & 1 - \phi_{CD} - \phi_{PD} & \phi_{DP} \\
\phi_{PC} & \phi_{PD} & 1 - \phi_{CP} - \phi_{DP}
\end{pmatrix}
\]

where $\phi_{ij}$ denotes the fixation probability with which the absorbing state consisting only of $j$ (the all-$j$ state) is overturned to the all-$i$ state by the invading type $i$, for $i, j \in \{C, D, P\}$. Fixation probabilities can be determined in a very simple way: $\phi_{ij}$ is given by multiplying the probability of the random mutation by the transition probability under imitation as the mutation rate goes to zero (Fudenberg and Imhof, 2006). When the population stays in a homogeneous state, it can be perturbed by $\mu_i$-sized mutants of one type. As mentioned, after the mutants arise, the two types of players compare their own payoff with that of the other type. For our simple imitation dynamics, each fixation probability is given by



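\[
\phi_{ij} =
\begin{cases}
p_i^{\mu_i} \cdot 1 & \text{if } \pi_{ij}(\mu_i, M - \mu_i) > \pi_{ji}(\mu_i, M - \mu_i), \\
p_i^{\mu_i} \cdot 0 & \text{if } \pi_{ij}(\mu_i, M - \mu_i) < \pi_{ji}(\mu_i, M - \mu_i), \\
p_i^{\mu_i} \cdot \dfrac{\mu_i}{M} & \text{if } \pi_{ij}(\mu_i, M - \mu_i) = \pi_{ji}(\mu_i, M - \mu_i),
\end{cases}
\tag{3}
\]

where $\pi_{ij}$ is the expected payoff of $i$ against $j$ when the population consists of $\mu_i$ players of type $i$ and $M - \mu_i$ players of type $j$, with all other types extinct, for $i, j \in \{C, D, P\}$, and $p_i$ is the probability that a mutant chooses type $i$ among the set of types.^2 For the interaction among C, D and P, $p_i$ is given by 1/2, as a mutant randomly chooses one of the alternatives to the type of the homogeneous state. The third case of (3) follows neutral drift, where the transition probability is $\mu_i / M$.^3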
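The fixation rule of the simple imitation dynamics can be written as a short function. The sketch below follows the description in the text: all $\mu$ mutants pick the same type with probability $(1/2)^{\mu}$, and the mutants then take over with probability 1, 0 or $\mu/M$ depending on the payoff comparison. The function name and the parameter `p_choose` are hypothetical.

```python
def imitation_fixation_prob(payoff_mutant: float, payoff_resident: float,
                            mu: int, M: int, p_choose: float = 0.5) -> float:
    """Fixation probability of mu mutants of one type in a population of M."""
    appear = p_choose ** mu  # all mu mutants choose the same mutant type
    if payoff_mutant > payoff_resident:
        takeover = 1.0       # residents imitate the better-off mutants
    elif payoff_mutant < payoff_resident:
        takeover = 0.0       # mutants imitate the residents and vanish
    else:
        takeover = mu / M    # neutral drift: a mutant is the random model
    return appear * takeover


better = imitation_fixation_prob(1.0, 0.5, mu=2, M=100)
worse = imitation_fixation_prob(0.5, 1.0, mu=2, M=100)
neutral = imitation_fixation_prob(1.0, 1.0, mu=2, M=100)
```

For $\mu = 2$ and $M = 100$ the three cases give 0.25, 0 and 0.005, which shows how much slower a neutral transition is than a payoff-driven one.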

For the class of irreducible transition matrices, the invariant distribution over the three states is uniquely given by the eigenvector of the largest eigenvalue, which is 1 in our case (Fudenberg and Imhof, 2006). Irreducible transition matrices, however, are not suitable for investigating our problem analytically, because the burden of calculation is hard to handle. Thus, we investigate an extreme case of a reducible transition matrix, in which full cooperation is realized. The following lemma characterizes this case.

Lemma 1. Assuming 2 < r < G, the fully cooperative state where D disappears is realized 
if and only if 

Lemma 1 tells us that full cooperation can be realized only in the all-P state if $2 < r < G$.^4 The absorbing all-P state cannot be overturned by an invasion of C or D, and P can fixate the all-D state. In the case of $r > G$, where the team is excessively productive, the evolution of cooperation may not be a serious issue, because C can fixate the all-D state. As a matter of fact, there is no need to introduce P in this case. This is why we restrict our attention to the case of $2 < r < G$. The following proposition shows that the evolution of cooperation is not delivered when CP is introduced with a sanctioning institution.

Proposition 1. For the interaction among C, D and P, sanction by institution coupled with CP cannot deliver the fully cooperative state.^5

As in the case of CP with sanction by peers (Hauert et al., 2007), a sanctioning institution cannot bring about the evolution of cooperation in the interaction among C, D and P. Intuitively, CP with institutional sanctioning is always invaded by C, because the benefits of the two types are the same but P has already paid the set-up fee for the sanctioning institution. Hence

\[
\pi_{CP}(\mu_C) > \pi_{PC}(\mu_C).
\]

Corollary 1.1. Institutional sanctioning with CP cannot bring about the evolution of cooperation when the loner (L) is introduced.

^2 Appendix B gives the exact definition of $\pi_{ij}$.

^3 In the case of neutral drift, a mutant enjoys the same payoff as the resident. The mutant type can fixate in the population when a mutant is randomly chosen as the model to imitate, which happens with probability $\mu_i / M$.

^4 The proof is in Appendix A.

^5 The proof is in Appendix B.
The proof of Corollary 1.1 is trivial. Brandt et al. (2006) show that L generates an evolutionary cycle when it is introduced into the interaction between C and D. The interaction among C, D, L and P brings about the evolution of cooperation only if C cannot fixate P. Under peer sanctioning, C and P enjoy the same payoff, which produces neutral drift between the two. But since institutional sanctioning with CP lets C fixate P, cyclical dynamics among C, D, L and P emerge, similar to those in the interaction among C, D and L.

The results so far imply that institutional sanctioning cannot succeed when P does not use the information from the committing stage. In other words, the CP cast in most evolutionary studies cannot validate our sanctioning institution. Proposition 1 changes in an intriguing way once SP, who takes advantage of the information, comes in.

Proposition 2. Assume a finite $M \gg G$. The fully cooperative state can be delivered by institutional sanctioning with the skeptical punisher (SP) if 1) two mutants are introduced, and 2) the intensity and the cost of punishment are smaller than properly given bounds.^6

Proposition 2 shows that the sanctioning institution works nicely when it is coupled with SP. The first condition of the proposition states that SP can invade the all-D state when two mutants arise. Imagine that a single SP is introduced into the all-D state. As there is no contribution, SP always defects, and the payoffs of SP and D from the contributing stage are the same. However, as SP pays the setup cost of the institution, $\pi_{PD}(\mu_P = 1) < \pi_{DP}(\mu_P = 1)$. When two mutants exist in the total population, the chance that the two are teamed up in the same group opens the door for SP to invade the all-D state.

The second condition is intriguing, since it specifies the range of $\beta$ and $\gamma$, the effectiveness and the cost of punishment, in which institutional sanctioning works well. Former studies show that peer sanctioning with an exit option can deliver the evolution of cooperation if punishment is sufficiently effective and its cost is affordable. This is needed to prevent D from fixating the all-P state, which is possible only when punishment is sufficiently effective. Our result shows that institutional sanctioning coupled with SP loses its power when punishment is too harsh. Heuristically, SP is not an unconditional cooperator but an opportunist, in the sense that she would take advantage of non-committers. Note that the institutional sanction is also applied to her if she defects. Punishment that is too effective may harm SP seriously, which can hinder SP from thriving in the population.

We can tell that increasing $\mu$ creates more favorable conditions for the fully cooperative state, in that the proper range of $\beta$ and $\gamma$ expands. The following proposition gives more general results for multiple mutants, $\mu \ge 2$.

^6 The proof is in Appendix C.
Proposition 3. Assume a finite $M \gg G$. The fully cooperative state can be delivered by institutional sanctioning with the skeptical punisher (SP) if 1) $\mu \ge 2$ mutants are introduced, and 2) the intensity and the cost of punishment are smaller than properly given bounds.^7

Corollary 3.1. The range of $\beta$ and $\gamma$ for the fully cooperative state 1) increases as $\mu$ gets larger, and 2) decreases as $M$ gets larger.^8

The second part of Corollary 3.1 implies that our institutional sanctioning works only when $M$ is not too large. This is interesting compared to the case of peer sanctioning with four types, C, D, L and CP. For any proper $\beta$ and $\gamma$, according to our simple dynamics, the frequency of the cooperative state is given simply by $\frac{4\mu + M}{7\mu + M}$, which converges to 1 as $M$ gets larger.^9

As mentioned, $\mu$-sized mutants of SP have to infiltrate the homogeneous state of D successfully, or at least drift neutrally against it, for the fully cooperative state to be realized. This relies on the sampling odds that more than one SP is selected into a team, which decrease as $M$ grows. If the institution cannot adjust its $\beta$ and $\gamma$, a well-operated sanctioning institution can become obsolete when the size of the total population increases. In the limiting case of $M \to \infty$, institutional sanctioning cannot help the evolution of full cooperation for arbitrarily given $\beta$ and $\gamma$.
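The sampling odds mentioned above can be quantified with a short calculation. The sketch below assumes that the $G - 1$ teammates of a focal mutant are drawn uniformly without replacement from the other $M - 1$ players; the function name is hypothetical.

```python
from math import comb


def prob_mutant_meets_mutant(M: int, G: int, mu: int) -> float:
    """Chance that a focal mutant has at least one of the other mu - 1 mutants
    among its G - 1 teammates, drawn from the remaining M - 1 players."""
    no_other_mutant = comb(M - mu, G - 1) / comb(M - 1, G - 1)
    return 1.0 - no_other_mutant


p_100 = prob_mutant_meets_mutant(M=100, G=5, mu=2)
p_200 = prob_mutant_meets_mutant(M=200, G=5, mu=2)
```

For two mutants the expression reduces to $(G - 1)/(M - 1)$, so with $G = 5$ the chance is $4/99$ for $M = 100$ and $4/199$ for $M = 200$: doubling the population roughly halves the odds that two SP mutants ever meet in a team.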

Moran Process 

We now extend the previous results on institutional sanctioning with SP to a more general and complex adaptive process. This shows that the simple imitation dynamics above is applicable more widely, including cases where an analytic approach is not feasible.

The Moran process is a natural choice for studying stochastic evolutionary dynamics. It is a classical model of population dynamics that was developed in population genetics and has recently been imported into game theory.^10 In every time step, an individual is chosen for reproduction with a probability proportional to its fitness, and it produces a single clone that replaces a randomly selected other member. The sampling for imitation based on payoffs continues until the population ends up in a homogeneous state. Fixation probabilities under the Moran process are given by

[Equation (4) is not recoverable from the source.]

^7 The proof is in Appendix D.

^8 The right-hand side (RHS) of the condition obtained in the proof in Appendix D directly proves this corollary.

^9 The detailed derivation is in Appendix E.

^10 See Fudenberg et al. (2004) for a detailed theoretical exposition of the Moran process.
where $\pi_{ij}(m)$ is the expected payoff of type $i$ against type $j$ when the population consists of $m$ players of type $i$ and $M - m$ players of type $j$, for $i, j \in \{C, D, P\}$.^11 In the Moran process, payoffs are adjusted by $s$ to prevent them from turning negative, where 1 is the baseline payoff and $s$ is the intensity of selection, which cannot be higher than $1/(1 - \min \pi_{ij})$. As all fixation probabilities are positive, the transition matrix is irreducible, which means that any state $i$ can be reached from any state $j \neq i$.^12 We compute the invariant distribution under the standard Moran process for properly given parameters.
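Equation (4) is not legible in the source, so the following sketch uses the standard fixation probability of a single mutant in a frequency-dependent Moran process, with fitness $1 - s + s\pi$ as described in the text. It is meant as an illustration of that textbook form and may differ from the authors' exact expression, for instance by a factor for the choice of the mutant type. The function name and the callable arguments are hypothetical.

```python
def moran_fixation_prob(payoff_mutant, payoff_resident, M: int, s: float) -> float:
    """Fixation probability of one mutant in a Moran process with M players.

    payoff_mutant(m) and payoff_resident(m) return the expected payoffs when
    m mutants and M - m residents are present. Fitness is 1 - s + s * payoff.
    """
    total = 1.0    # the empty product for k = 0
    product = 1.0  # running product of resident-to-mutant fitness ratios
    for m in range(1, M):
        f_mut = 1.0 - s + s * payoff_mutant(m)
        f_res = 1.0 - s + s * payoff_resident(m)
        if f_mut <= 0.0 or f_res < 0.0:
            raise ValueError("s is too large: fitness must stay positive")
        product *= f_res / f_mut
        total += product
    return 1.0 / total


neutral = moran_fixation_prob(lambda m: 1.0, lambda m: 1.0, M=100, s=0.3)
favored = moran_fixation_prob(lambda m: 2.0, lambda m: 1.0, M=10, s=0.5)
```

With equal payoffs the function returns $1/M$, the neutral benchmark against which the figure below compares $\phi_{PD}$. The error raised for non-positive fitness corresponds to the upper bound on $s$ stated in the text.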

Claim 1. Under the Moran process, institutional sanctioning with the skeptical punisher can bring about the evolution of cooperation for a properly given low intensity and low cost of punishment.



Figure 1: The evolution of cooperation by institutional sanctioning with SP under the Moran process. Values are obtained from the standard Moran process with parameters $M = 100$, $G = 5$, $r = 3$ and $s = 0.3$. A computer program generates the values in steps of 0.01 for $\beta$ and $\gamma$.^13 COP is the sum of the frequencies with which the population stays in the all-C and all-P states. (a) shows the range of $\beta$ and $\gamma$ that yields the fully cooperative state. In the darker area, $\phi_{PD} > 1/M$. (b) draws the contour map, in which the gray scale represents the frequency of the cooperative state from black (COP = 0) to white (COP = 1). The scale of $\beta$ and $\gamma$ is adjusted for visualization.

[Figure 1 panels: (a) Evolution of cooperation; (b) Contour plot]

Figure 1 exemplifies Claim 1. For properly given small $\beta$ and $\gamma$, the evolution of cooperation is realized by institutional sanctioning coupled with SP. In the darker area in (a) of Figure 1, where $\phi_{PD} > 1/M$, the fixation from D to SP is fast enough. This ensures the stability of the cooperative state. Intriguingly, the earlier results from our simple imitation dynamics closely resemble those from the Moran process, which has a more general and complex formulation of players' imitation process.

^11 See Appendix B for the exact formulation of $\pi_{ij}(m)$. Consult Traulsen and Hauert (2008) for an accessible derivation of (4).

^12 The transition matrix is trivially aperiodic and recurrent, thus the invariant distribution exists.

^13 The calculating modules are written in MATHEMATICA 8.0 of Wolfram, Inc.
Figure 2: The effect of increasing $s$ on the frequency of each type under the standard Moran process. Parameters are $r = 3$, $\gamma = 0.05$, $G = 5$ and $M = 100$, and the maximum $s$ is 0.705. As $\beta$ increases, the level of $s$ at which the cooperative state unravels decreases.



The intensity of selection, $s$, also affects the working of institutional sanctioning. Figure 2 illustrates that the effect of $s$ on the frequency of the cooperative state changes direction at some level of $s$.
Figure 3: The effect of increasing $M$ on the frequency of cooperation. Parameters are $r = 3$, $\gamma = 0.05$, $s = 0.3$ and $G = 5$. When $M$ is larger than a certain level, institutional sanctioning cannot deliver the evolution of cooperation. The Moran process yields conditions on $\beta$ and $\gamma$ for delivering the evolution of cooperation that are similar to those of the simple imitation dynamics.



Corollary 3.1 shows that the size of $M$ has a negative effect on the frequency of the cooperative state for given $\beta$ and $\gamma$. The Moran process for institutional sanctioning replicates this result. Figure 3 illustrates that an increase in $M$ unravels the cooperative state. For given $\beta$ and $\gamma$, as $M$ increases, the frequency with which the population stays in the cooperative state drops abruptly at a certain level of $M$. This implies that, under the Moran process, our institutional sanctioning works effectively only up to a certain level of $M$.

5 Concluding Remarks 

We have examined the evolution of cooperation with a focus on how punishment is carried out. Considering that most punishment tends to involve an institution, our study fills a gap in the research, which conventionally assumes peer sanctioning. The main lessons of this paper are as follows.

1) An institution can be justified when both the efficacy and the cost of sanctioning are not too large compared with those of sanctioning by peers. This is a reasonable result considering real-world institutions and the details of their sanctioning.

2) The working of an institution depends on the skepticism of players who carefully watch the signals from commitment to determine their strategies. Our implication is similar to the insight provided by studies of tag-based evolution (Riolo et al., 2001). Ours, however, makes more economic sense than tag-based evolution, since the issues of institution and commitment are explicitly integrated.

Sanctioning by institution is one of the key aspects that the study of the evolution of cooperation should explore. This paper suggests one simple route by which institutional sanctioning affects the evolutionary process among different types of players and the evolution of cooperation. Admittedly, the theoretical setting of this paper is so simple that other interesting problems related to institutions are left out of consideration. Further studies on the many details of institutions may enrich research in evolutionary game theory and its dynamics on the evolution of cooperation.

Appendix A Proof of Lemma 1 

It is easy to show that $\phi_{DC} > 0$ and $\phi_{CD} = 0$ are satisfied when $2 < r < G$. Given this, the other fixation probabilities have to be determined such that the reducible transition matrix ends up in the fully cooperative state. In general, the invariant distribution of a reducible transition matrix depends on the initial condition, but we can find the condition for the fully cooperative state. $\phi_{PD} > 0$ and $\phi_{DP} = 0$ should hold to prevent D from appearing in the invariant distribution, and $\phi_{CP} = 0$ should hold to prevent cyclic movement among C, D and P. ■
Appendix B Proof of Proposition 1

When $\mu$ mutants arise in a population of size $M$, they can be grouped into PGG teams in different ways. This process can be calculated using the hypergeometric distribution. Let $\mu_k = \mu = 2$ for $k \in \{C, D, P\}$. The expected payoffs of a representative player of type $i$ against type $j$, sampled into PGG groups of size $G$, are given by



[The displayed equation is garbled in the source. Its recoverable structure consists of hypergeometric sums over the number of teammates $n_i$, each normalized by $\binom{M-1}{G-1}$.]

where $V_i(n_i) := V_i(n_i + 1, G - n_i - 1, n_k = 0)$, $n_i$ is the number of type-$i$ players excluding the focal type-$i$ player, the number of type-$j$ players is $G - n_i - 1$, and the number of players of the remaining type $k \neq i, j$, $n_k$, is 0, for $i, j \in \{C, D, P\}$.
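Since the displayed sums are garbled in the source, the following sketch shows one way to compute such an expected payoff by hypergeometric sampling. It assumes that the $G - 1$ teammates of a focal player are drawn without replacement from the other $M - 1$ players and that only two types are present; the function name and the `team_payoff` callback are hypothetical. The example uses the plain PGG payoffs of C and D.

```python
from math import comb


def expected_payoff(n_same: int, n_other: int, G: int, team_payoff) -> float:
    """Expected payoff of a focal player in a two-type population.

    n_same: players of the focal type, excluding the focal player.
    n_other: players of the other type, so that M = n_same + n_other + 1.
    team_payoff(k): focal payoff when k of its G - 1 teammates share its type.
    """
    M = n_same + n_other + 1
    all_teams = comb(M - 1, G - 1)
    value = 0.0
    for k in range(G):
        # Number of ways to draw k same-type and G - 1 - k other-type teammates.
        ways = comb(n_same, k) * comb(n_other, G - 1 - k)
        value += ways / all_teams * team_payoff(k)
    return value


M, G, r, mu = 100, 5, 3.0, 2
# Focal cooperator: k other cooperators in the team, so k + 1 contributions.
pi_cd = expected_payoff(mu - 1, M - mu, G, lambda k: r * (k + 1) / G - 1)
# Focal defector: k other defectors in the team, so G - 1 - k contributions.
pi_dc = expected_payoff(M - mu - 1, mu, G, lambda k: r * (G - 1 - k) / G)
```

In this example the difference `pi_cd - pi_dc` equals $\frac{r(M - G)}{G(M - 1)} - 1$, which is negative whenever $r < G$; this is why C cannot invade the all-D state in the range of $r$ considered in the paper.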

The proposition can be proved easily by showing that the first part of the lemma cannot be satisfied. It is sufficient to show that $\phi_{CP} = 0$ cannot hold. As assumed, P is CP. The payoff of the invading $\mu_C$-sized C against the resident P and that of the resident P against the invading C are given by

\[
\pi_{CP}(\mu_C) = r - 1, \qquad \pi_{PC}(\mu_C) = r - 1 - \gamma.
\]

When $\gamma > 0$, $\pi_{CP}(\mu_C) > \pi_{PC}(\mu_C)$ for any positive $\mu_C$. Thus, all the resident P turn into C by imitation. The fixation probability of C in the all-P state, $\phi_{CP}$, is given by $(1/2)^{\mu_C} \cdot 1$. $\phi_{CP}$ is always positive, which violates Lemma 1. ■



Appendix C Proof of Proposition 2 

We check the conditions in the first line of Lemma 1. For $\phi_{CD} = 0$, some calculation shows that $\pi_{CD}(\mu_C) - \pi_{DC}(\mu_C)$ is given by $\frac{r(M - G)}{G(M - 1)} - 1$, which is negative for $r < G$. Thus, this condition is easily satisfied for any positive $\mu_C$. The other three conditions in the first line of Lemma 1 are simplified and arranged as conditions on $\beta$ and $\gamma$. All three conditions are linear in $r$, and are given by

[Equations (C.1) to (C.4) are garbled in the source. The surviving fragments are $\pi_{CP}(\mu_C) - \pi_{PC}(\mu_C) = A + \dots$ for (C.1), a remark on the signs of the coefficients, and $0 < -\frac{A}{B} < r$ for (C.2).]

(C.2) states that $r$ has to be properly defined if P is to fixate D. (C.3) and (C.4) mean that neither C nor D can fixate P for $r$ above the corresponding threshold. With tedious arithmetic, these conditions boil down to

[Equation (C.5) is garbled in the source. Its recoverable fragments are $\frac{(G+1)M^2 - G(G+5)M + 2G(G+1)}{G(G+1)(M-2)(M-1)}$ and $\frac{M - 2G + 2}{M^2 - 3M + 2}$.]

where $2 < r < G$ for $G \ll M$, and (C.5) defines the proper range for $\beta$ and $\gamma$ in the proposition. ■



Appendix D Proof of Proposition 3 

The technique for proving this proposition is identical to that of Proposition 2, but the calculation cannot be carried out for general $\mu$. For our purpose, it is sufficient to show that institutional sanctioning works with SP for some proper $\beta$ and $\gamma$. Our strategy of proof is as follows. If institutional sanctioning with SP works with less punishment on D and more on P, it still works in the case of the proposition. This less-and-more system can be expressed neatly. We replace $V_D$ and $V_P$ in (2) with

[Equation (D.1) is garbled in the source. The surviving fragment for the punisher reads $V_P(x_C, x_D, x_P) := r\,\frac{\cdots}{G} - s_P \cdot 1 - \beta\,(1 - s_P)\,\frac{\cdots}{x_D} - \gamma$; the line for $V_D$ is not recoverable.]

The punishment term for D in (D.1) is smaller than that in (2), and that for P is larger. If sanctioning works with less effective punishment, it must also work with more effective punishment. Thus, the range of $\beta$ and $\gamma$ that realizes full cooperation under (D.1) is a sufficient condition for that under (2). The actual proof proceeds as in Appendix C, and the condition on $\beta$ and $\gamma$ with $\mu$ is given by

[Equation (D.2) is garbled in the source. Its recoverable fragment is the bound $\frac{(M - 2G + 2)(\mu - 1)}{(M - 2)(M - 1)}$.]

where $2 < r < G$ for $G \ll M$. ■



Appendix E Frequency of Cooperative State with Peer 
Sanctioning Applied among C, D, L and CP 

For interactions among C, D, L and CP, a sufficient level of the cooperative state is reached when the following fixation relations hold: 1) D fixates C, 2) L fixates D, 3) C and CP fixate L, and 4) C and CP drift neutrally against each other, while no other fixation relation is possible in the proper case of the evolution of cooperation.

The first is given by $2 < r < G$. The second is easily justified when the payoff of the exit option is higher than the payoff of the homogeneous D state. The third holds when the payoff of the fully cooperative state is higher than that of the exit option. The last one merits discussion. In the evolutionary dynamics of "via freedom to coercion", the population stays at the cooperative state for a sufficiently long time. This is ensured by the neutral drift between C and CP. During this drift the population is successfully defended against D as long as P is still present. When the population is fixated by C, D can prosper, but this state is quickly taken over by L again. When the above four fixation relations are satisfied, the $\mu$-sized imitation dynamics yields the following transition matrix, with the states ordered C, D, L, CP and each column collecting the transitions out of one state:

\[
\begin{pmatrix}
1 - \left(\frac{1}{3}\right)^{\mu} - \left(\frac{1}{3}\right)^{\mu}\frac{\mu}{M} & 0 & \left(\frac{1}{3}\right)^{\mu} & \left(\frac{1}{3}\right)^{\mu}\frac{\mu}{M} \\
\left(\frac{1}{3}\right)^{\mu} & 1 - \left(\frac{1}{3}\right)^{\mu} & 0 & 0 \\
0 & \left(\frac{1}{3}\right)^{\mu} & 1 - 2\left(\frac{1}{3}\right)^{\mu} & 0 \\
\left(\frac{1}{3}\right)^{\mu}\frac{\mu}{M} & 0 & \left(\frac{1}{3}\right)^{\mu} & 1 - \left(\frac{1}{3}\right)^{\mu}\frac{\mu}{M}
\end{pmatrix}
\]

This matrix is irreducible, and its unique invariant distribution is given by the normalized eigenvector of eigenvalue 1. The frequency of the cooperative state is $\frac{4\mu + M}{7\mu + M}$.
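The invariant distribution of this chain can be checked with exact arithmetic. The sketch below builds a four-state chain from the four fixation relations listed above, assuming that a mutant picks each of the three alternative types with probability 1/3 and that neutral drift succeeds with probability $\mu/M$. The state ordering (C, D, L, CP), the column-stochastic layout and the function names are assumptions made for this illustration.

```python
from fractions import Fraction


def stationary_distribution(T):
    """Stationary vector of a column-stochastic matrix T, T[i][j] = P(j -> i)."""
    n = len(T)
    # Solve (T - I) pi = 0 with the last equation replaced by sum(pi) = 1.
    A = [[Fraction(T[i][j]) - (1 if i == j else 0) for j in range(n)] for i in range(n)]
    b = [Fraction(0)] * (n - 1) + [Fraction(1)]
    A[-1] = [Fraction(1)] * n
    # Gauss-Jordan elimination with exact fractions.
    for col in range(n):
        pivot = next(row for row in range(col, n) if A[row][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(n):
            if row != col and A[row][col] != 0:
                factor = A[row][col] / A[col][col]
                A[row] = [x - factor * y for x, y in zip(A[row], A[col])]
                b[row] -= factor * b[col]
    return [b[i] / A[i][i] for i in range(n)]


def peer_chain(mu: int, M: int):
    """Assumed chain over (C, D, L, CP); column j lists the moves out of state j."""
    a = Fraction(1, 3) ** mu    # payoff-driven takeover
    d = a * Fraction(mu, M)     # neutral drift between C and CP
    return [
        [1 - a - d, 0,     a,         d],
        [a,         1 - a, 0,         0],
        [0,         a,     1 - 2 * a, 0],
        [d,         0,     a,         1 - d],
    ]


pi = stationary_distribution(peer_chain(mu=2, M=100))
cooperative_frequency = pi[0] + pi[3]
```

For this chain the cooperative frequency works out to $(4\mu + M)/(7\mu + M)$, for example $18/19$ for $\mu = 2$ and $M = 100$, and it approaches 1 as $M$ grows because the neutral drift out of CP becomes slower.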